{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# Build a RAG chain using Zilliz Cloud and AWS Bedrock\n",
    "Combine [Zilliz Cloud](https://zilliz.com/cloud) and [AWS Bedrock](https://aws.amazon.com/bedrock/) to build a RAG (Retrieval-Augmented Generation) chain with the [LangChain](https://langchain.com/) framework.\n",
    "\n",
    "The RAG chain consists of a retriever that fetches relevant documents from a vector store, a prompt template that builds the input prompt, a language model that generates the response, and an output parser that formats the generated response.\n",
    "\n",
    "Zilliz Cloud handles vector storage and retrieval, and AWS Bedrock hosts the language model and the embedding model. LangChain connects these components into the RAG chain."
   ]
  },
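  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of how these four stages hand data to one another, here is a minimal plain-Python sketch. It does not use LangChain, Zilliz Cloud, or AWS Bedrock: `toy_retriever`, `build_prompt`, `fake_llm`, and `parse_output` are hypothetical stand-ins invented for this sketch, and the keyword-overlap ranking only mimics what a vector search would do.\n",
    "\n",
    "```python\n",
    "# Hypothetical stand-ins for the four stages; none of these are LangChain APIs.\n",
    "CORPUS = [\n",
    "    \"Zilliz Cloud stores vectors.\",\n",
    "    \"AWS Bedrock hosts language models.\",\n",
    "    \"LangChain connects components.\",\n",
    "]\n",
    "\n",
    "\n",
    "def toy_retriever(question, k=1):\n",
    "    # Rank documents by how many lowercase words they share with the question.\n",
    "    words = set(question.lower().split())\n",
    "    ranked = sorted(\n",
    "        CORPUS,\n",
    "        key=lambda doc: len(words & set(doc.lower().rstrip(\".\").split())),\n",
    "        reverse=True,\n",
    "    )\n",
    "    return ranked[:k]\n",
    "\n",
    "\n",
    "def build_prompt(context, question):\n",
    "    # Fill a fixed template, as a prompt template would.\n",
    "    return f\"Context: {context}\\nQuestion: {question}\"\n",
    "\n",
    "\n",
    "def fake_llm(prompt):\n",
    "    # A real model would generate text; this stub just wraps the prompt.\n",
    "    return {\"content\": \"ANSWER based on -> \" + prompt}\n",
    "\n",
    "\n",
    "def parse_output(message):\n",
    "    # Extract the plain string from the model's message object.\n",
    "    return message[\"content\"]\n",
    "\n",
    "\n",
    "def toy_rag_chain(question):\n",
    "    docs = toy_retriever(question)\n",
    "    context = \"\\n\\n\".join(docs)\n",
    "    return parse_output(fake_llm(build_prompt(context, question)))\n",
    "\n",
    "\n",
    "print(toy_rag_chain(\"Which service hosts language models?\"))\n",
    "```\n",
    "\n",
    "The real chain built later in this notebook follows the same order, with LangChain wiring the stages together."
   ]
  },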
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Prerequisites\n",
    "Install the required packages and set the required environment variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# ! pip install --upgrade --quiet langchain langchain-core langchain-text-splitters langchain-community langchain-aws beautifulsoup4 boto3 pymilvus"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import bs4\n",
    "import boto3\n",
    "from langchain_aws import BedrockEmbeddings, ChatBedrock\n",
    "from langchain_community.vectorstores import Zilliz\n",
    "from langchain_community.document_loaders import WebBaseLoader\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "from langchain_core.prompts import PromptTemplate\n",
    "from langchain_core.runnables import RunnablePassthrough\n",
    "from langchain_text_splitters import RecursiveCharacterTextSplitter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
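   },
   "source": [
    "Read the AWS credentials and the Zilliz Cloud connection settings from environment variables. Set `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `ZILLIZ_CLOUD_URI`, and `ZILLIZ_CLOUD_API_KEY` before running the next cell."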
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# AWS region, plus credentials read from environment variables\n",
    "REGION_NAME = \"us-east-1\"\n",
    "AWS_ACCESS_KEY_ID = os.getenv(\"AWS_ACCESS_KEY_ID\")\n",
    "AWS_SECRET_ACCESS_KEY = os.getenv(\"AWS_SECRET_ACCESS_KEY\")\n",
    "\n",
    "# Zilliz Cloud connection settings, read from environment variables\n",
    "ZILLIZ_CLOUD_URI = os.getenv(\"ZILLIZ_CLOUD_URI\")\n",
    "ZILLIZ_CLOUD_API_KEY = os.getenv(\"ZILLIZ_CLOUD_API_KEY\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "To find the Zilliz Cloud URI and API key, follow the [Zilliz Cloud console guide](https://docs.zilliz.com/docs/on-zilliz-cloud-console).\n",
    "\n",
    "In short, both values are shown on your Zilliz Cloud cluster page.\n",
    "![](../../images/zilliz_uri_and_key.png)"
   ]
  },
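  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because `os.getenv` returns `None` for an unset variable, a missing credential would otherwise only surface later as a connection or authentication error. A small fail-fast check can catch it earlier. The following is a minimal sketch; the helper name `find_missing_env_vars` is hypothetical and not part of any library.\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "REQUIRED_ENV_VARS = [\n",
    "    \"AWS_ACCESS_KEY_ID\",\n",
    "    \"AWS_SECRET_ACCESS_KEY\",\n",
    "    \"ZILLIZ_CLOUD_URI\",\n",
    "    \"ZILLIZ_CLOUD_API_KEY\",\n",
    "]\n",
    "\n",
    "\n",
    "def find_missing_env_vars(names, env=None):\n",
    "    # Return the names that are unset or empty in the given mapping\n",
    "    # (defaults to the current process environment).\n",
    "    env = os.environ if env is None else env\n",
    "    return [name for name in names if not env.get(name)]\n",
    "\n",
    "\n",
    "missing = find_missing_env_vars(REQUIRED_ENV_VARS)\n",
    "if missing:\n",
    "    print(\"Missing environment variables: \" + \", \".join(missing))\n",
    "else:\n",
    "    print(\"All required environment variables are set.\")\n",
    "```"
   ]
  },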
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Create LLM and embedding models using AWS Bedrock\n",
    "Create a Bedrock runtime client, then use it to initialize the language model and the embedding model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# Create a boto3 client with the specified credentials\n",
    "client = boto3.client(\n",
    "    \"bedrock-runtime\",\n",
    "    region_name=REGION_NAME,\n",
    "    aws_access_key_id=AWS_ACCESS_KEY_ID,\n",
    "    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,\n",
    ")\n",
    "\n",
    "# Initialize the ChatBedrock instance for language model operations\n",
    "llm = ChatBedrock(\n",
    "    client=client,\n",
    "    model_id=\"anthropic.claude-3-sonnet-20240229-v1:0\",\n",
    "    region_name=REGION_NAME,\n",
    "    model_kwargs={\"temperature\": 0.1},\n",
    ")\n",
    "\n",
    "# Initialize the BedrockEmbeddings instance for handling text embeddings\n",
    "embeddings = BedrockEmbeddings(client=client, region_name=REGION_NAME)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Load documents and split them into chunks\n",
    "We use the LangChain `WebBaseLoader` to load documents from web sources, then split them into chunks with the `RecursiveCharacterTextSplitter`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# Create a WebBaseLoader instance to load documents from web sources\n",
    "loader = WebBaseLoader(\n",
    "    web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
    "    bs_kwargs=dict(\n",
    "        parse_only=bs4.SoupStrainer(\n",
    "            class_=(\"post-content\", \"post-title\", \"post-header\")\n",
    "        )\n",
    "    ),\n",
    ")\n",
    "# Load documents from web sources using the loader\n",
    "documents = loader.load()\n",
    "# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks\n",
    "text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)\n",
    "\n",
    "# Split the documents into chunks using the text_splitter\n",
    "docs = text_splitter.split_documents(documents)"
   ]
  },
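  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter in plain Python. It is only an illustration: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its chunks will generally differ, and `sliding_window_chunks` is a hypothetical helper written for this sketch.\n",
    "\n",
    "```python\n",
    "def sliding_window_chunks(text, chunk_size, chunk_overlap):\n",
    "    # Each new chunk starts (chunk_size - chunk_overlap) characters after the\n",
    "    # previous one, so consecutive chunks share chunk_overlap characters.\n",
    "    if chunk_overlap >= chunk_size:\n",
    "        raise ValueError(\"chunk_overlap must be smaller than chunk_size\")\n",
    "    step = chunk_size - chunk_overlap\n",
    "    chunks = []\n",
    "    for start in range(0, len(text), step):\n",
    "        chunks.append(text[start:start + chunk_size])\n",
    "        if start + chunk_size >= len(text):\n",
    "            break  # the last window already reaches the end of the text\n",
    "    return chunks\n",
    "\n",
    "\n",
    "sample = \"abcdefghijklmnopqrstuvwxyz\"\n",
    "for chunk in sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3):\n",
    "    print(chunk)\n",
    "```\n",
    "\n",
    "With the settings used in this notebook (`chunk_size=2000`, `chunk_overlap=200`), neighboring chunks can share up to 200 characters, which helps keep text that straddles a chunk boundary retrievable."
   ]
  },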
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Create the RAG chain and invoke it\n",
    "We use the LangChain framework to create the RAG chain. Its retriever is backed by a Zilliz vector store: calling `Zilliz.from_documents` embeds the loaded chunks and creates the vector store from them, and `as_retriever` then exposes that store as a retriever. Note that `drop_old=True` drops an existing collection with the same name before the new documents are inserted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# Define the prompt template for generating AI responses\n",
    "PROMPT_TEMPLATE = \"\"\"\n",
    "Human: You are a financial advisor AI system that answers questions using fact-based and statistical information when possible.\n",
    "Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags.\n",
    "If you don't know the answer, just say that you don't know. Don't try to make up an answer.\n",
    "<context>\n",
    "{context}\n",
    "</context>\n",
    "\n",
    "<question>\n",
    "{question}\n",
    "</question>\n",
    "\n",
    "The response should be specific and use statistics or numbers when possible.\n",
    "\n",
    "Assistant:\"\"\"\n",
    "# Create a PromptTemplate instance with the defined template and input variables\n",
    "prompt = PromptTemplate(\n",
    "    template=PROMPT_TEMPLATE, input_variables=[\"context\", \"question\"]\n",
    ")\n",
    "\n",
    "# Initialize Zilliz vector store from the loaded documents and embeddings\n",
    "vectorstore = Zilliz.from_documents(\n",
    "    documents=docs,\n",
    "    embedding=embeddings,\n",
    "    connection_args={\n",
    "        \"uri\": ZILLIZ_CLOUD_URI,\n",
    "        \"token\": ZILLIZ_CLOUD_API_KEY,\n",
    "        \"secure\": True,\n",
    "    },\n",
    "    auto_id=True,\n",
    "    drop_old=True,\n",
    ")\n",
    "\n",
    "# Create a retriever over the vector store\n",
    "retriever = vectorstore.as_retriever()\n",
    "\n",
    "\n",
    "# Join the text of the retrieved documents into a single context string\n",
    "def format_docs(docs):\n",
    "    return \"\\n\\n\".join(doc.page_content for doc in docs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Self-reflection is a vital capability that allows autonomous AI agents to improve iteratively by analyzing and refining their past actions, decisions, and mistakes. Some key aspects of self-reflection for AI agents include:\n",
      "\n",
      "1. Evaluating the efficiency and effectiveness of past reasoning trajectories and action sequences to identify potential issues like inefficient planning or hallucinations (generating consecutive identical actions without progress).\n",
      "\n",
      "2. Synthesizing observations and memories from past experiences into higher-level inferences or summaries to guide future behavior.\n",
      "\n",
      "3. Generating reflective questions based on recent observations and attempting to answer those questions to gain insights.\n",
      "\n",
      "4. Incorporating reflections into the agent's working memory to provide additional context for querying the language model and adjusting future plans.\n",
      "\n",
      "For example, in the Reflexion framework (Shinn & Labash 2023), the agent computes a heuristic after each action to determine if the current trajectory is inefficient or contains hallucinations. If so, it can reset the environment and start a new trial, utilizing the self-reflections added to its memory to adjust its planning and reasoning process.\n"
     ]
    }
   ],
   "source": [
    "# Define the RAG (Retrieval-Augmented Generation) chain for AI response generation\n",
    "rag_chain = (\n",
    "    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
    "    | prompt\n",
    "    | llm\n",
    "    | StrOutputParser()\n",
    ")\n",
    "\n",
    "# rag_chain.get_graph().print_ascii()\n",
    "\n",
    "# Invoke the RAG chain with a specific question and retrieve the response\n",
    "res = rag_chain.invoke(\"What is self-reflection of an AI Agent?\")\n",
    "print(res)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Looks great: the RAG chain answered the question using relevant context retrieved from the web page and provided a concise and correct answer."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}